Improving scalability of Bag-of-Tasks applications running on master-slave platforms

Authors

  • Fabrício Alves Barbosa da Silva
  • Hermes Senger
Abstract

Bag-of-Tasks applications are parallel applications composed of independent tasks. Examples of Bag-of-Tasks (BoT) applications include Monte Carlo simulations, massive searches (such as key breaking), image manipulation applications and data mining algorithms. This paper analyzes the scalability of Bag-of-Tasks applications running on master–slave platforms and proposes a scalability-related measure dubbed input file affinity. In this work, we also illustrate how the input file affinity, which is a characteristic of an application, can be used to improve the scalability of Bag-of-Tasks applications running on master–slave platforms. The input file affinity was considered in a new scheduling algorithm dubbed Dynamic Clustering, which is oblivious to task execution times. We compare the scalability of the Dynamic Clustering algorithm to several other algorithms, oblivious and non-oblivious to task execution times, proposed in the literature. We show in this paper that, in several situations, the oblivious algorithm Dynamic Clustering has scalability performance comparable to non-oblivious algorithms, which is remarkable considering that our oblivious algorithm uses much less information to schedule tasks.

© 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.parco.2008.09.013

Similar resources

A Scheduling Algorithm for Running Bag-of-Tasks Data Mining Applications on the Grid

Data mining applications are composed of computing-intensive processing tasks, which are natural candidates for execution on high performance, high throughput platforms such as PC clusters and computational grids. Besides, some data-mining algorithms can be implemented as Bag-of-Tasks (BoT) applications, which are composed of parallel, independent tasks. Due to its own nature, the adaptation of...


Agent-based hierarchical approach for executing bag-of-tasks in clouds

Unlike message-passing applications, “bag-of-tasks” applications (BoTs), whose tasks are unrelated and independent (no inter-task communication), can be highly parallelized and executed in any acceptable order. A common practice when executing bag-of-tasks applications (BoT) is to exploit the master-slave topology. Cloud environments offer some features that facilitate the execution of BoT appl...


PvmJobs: A Generic Parallel Jobs Library for PVM

PvmJobs is a general bag-of-jobs library for PVM that works with any user created job structure in a master/slave paradigm. A master can spawn slave processes, schedule and dispatch jobs to slaves, coordinate and synchronize the activities. A slave process obtains a job from the master, performs a set of prescribed tasks, returns results to the master, and obtains the next job. Slaves are organ...


Algorithmic and Scheduling Techniques for Heterogeneous and Distributed Computing

The computing and communication resources of high performance computing systems are becoming heterogeneous, are exhibiting performance fluctuations and are failing in an unforeseeable manner. The Master-Slave (MS) paradigm, that decomposes the computational load into independent tasks, is well-suited for operating in these environments due to its loose synchronization requirements. The applicat...


Scheduling Algorithms for Data Redistribution and Load-Balancing on Master-Slave Platforms

In this work we are interested in the problem of scheduling and redistributing data on master-slave platforms. We consider the case where the workers possess initial loads, some of which have to be redistributed in order to balance their completion times. We assume that the data consists of independent and identical tasks. We prove the NP-completeness of the problem for fully heterogeneous pla...

Journal:
  • Parallel Computing

Volume 35

Publication date: 2009